Project Plan: Object Recognition with Videos
Abstract
Deep convolutional neural networks (CNNs) have demonstrated strong performance on object detection and classification in still images. However, because of pose changes, scale variation, and other complexities, object detectors and classifiers trained on still images cannot achieve comparable performance when applied directly to videos. In this project, we adopt T-CNN [1] as the baseline framework: a deep learning framework that combines object detection with object tracking and incorporates information unique to videos, such as temporal and contextual information. To improve object recognition in videos, this project will explore modifications to the baseline framework, such as bounding box perturbation, different input features, data configuration and augmentation, and changing and merging components. The enhanced framework will be evaluated on the ImageNet [2] ILSVRC2015 [14] VID dataset, using mean Average Precision (mAP) as the evaluation metric.
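As an illustration of the evaluation metric named above, the following is a minimal sketch of how average precision for one class, and its mean over classes, might be computed from ranked detections. It assumes all-point interpolation of the precision-recall curve and assumes that each detection has already been marked as a true or false positive (for example by an overlap threshold against ground truth); the official ILSVRC2015 VID protocol may differ in these details. The function names average_precision and mean_average_precision are hypothetical and do not come from T-CNN or the ImageNet toolkit.

```python
from typing import Sequence


def average_precision(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> float:
    """Area under the precision-recall curve for one class.

    Assumption: all-point interpolation; matching detections to ground
    truth has been done beforehand and is passed in as is_true_positive.
    """
    if num_ground_truth <= 0:
        return 0.0

    # Rank detections by descending confidence score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Interpolate: precision at rank k becomes the maximum precision
    # at rank k or any later rank.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum rectangle areas over the steps where recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """Unweighted mean of the per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

For example, three detections scored 0.9, 0.8, and 0.7, of which the first and third are correct, with two ground-truth objects in total, give an average precision of 5/6 under this interpolation.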
Similar resources
COMP 4801 Final Report
Object recognition is an interesting task in computer vision with a wide range of real-world applications. While object recognition in still images has achieved impressive performance, object recognition in videos is yet to be explored. Due to motion blur and other complexities, still image methods directly applied to video frames usually cannot yield satisfying results, which calls for the ne...
Video Object Recognition and Modeling by SIFT Matching Optimization
In this paper we present a novel technique for object modeling and object recognition in video. Given a set of videos containing 360-degree views of objects, we compute a model for each object, then we analyze short videos to determine whether the object depicted in the video is one of the modeled objects. The object model is built from a video spanning a 360-degree view of the object taken against ...
Visor: Video Surveillance Online Repository
The aim of the Visor Project [1] is to gather and make freely available a repository of surveillance and video footage for the research community on pattern recognition and multimedia retrieval. The goal is to create an open forum and a free repository to exchange, compare and discuss results of many problems in video surveillance and retrieval. Together with the videos, the repository contains me...
Rapid 3D Object Recognition for Automatic Project Progress Monitoring Using a Stereo Vision System
Progress monitoring, a critical tool for successful construction projects, provides information on the present state of construction, which can then be compared with the original plan. The comparison can be used in decision-making to control variations in project performance. However, current methods for acquiring and updating information on the progress of a project using digital cameras and l...